6 - Deep Learning - Feedforward Networks Part 1 [ID:16750]

Welcome everybody to our lecture on deep learning. Today we want to go into the topic and we want to introduce some of the important concepts and theory that have been fundamental for the field.

Okay, today's topic will be feedforward networks. Feedforward networks are essentially the main configuration of neural networks as we use them today.

So in the next couple of videos we want to talk about first the model and some ideas behind it, also introduce a bit of theory.

One important block will be about universal function approximation where we will essentially show that neural networks are able to approximate any kind of function.

This will then be followed by the introduction of the softmax function and of some activation functions.

And in the end we want to talk a bit about how to optimize the parameters of such networks.

And in particular we will talk about the back propagation algorithm.

So let's start with the model. You have already heard about the perceptron; we talked about this earlier.

It was essentially a function that maps any high-dimensional input by computing the inner product of a weight vector with the input.

We are then only interested in the signed distance that is computed this way, and you can interpret it essentially as you see here on the right-hand side.

The decision boundary is a line, shown here in red.

What you're computing with this inner product is essentially the signed distance of a new sample to this decision boundary.

And if we then only look at the sign, we decide whether we are on the one side or the other.
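To make this concrete, here is a minimal sketch of that decision rule in NumPy; the function name and the plain inner-product form (no explicit bias term) are illustrative choices, not taken from the lecture.

```python
import numpy as np

def perceptron_decide(w, x):
    # Inner product of the weight vector with the input.
    score = np.dot(w, x)
    # Dividing by the norm of w turns the score into the signed
    # distance of the sample x from the decision boundary.
    signed_distance = score / np.linalg.norm(w)
    # Only the sign matters for the class decision: one side or the other.
    return np.sign(signed_distance), signed_distance
```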

Now, in classical pattern recognition and machine learning, which is the domain we are still in right now, we would typically follow a so-called pattern recognition pipeline.

There we have some measurement that is converted and then pre-processed in order to increase the quality and decrease the noise.

But in the pre-processing you essentially stay in the same domain as the input.

So if you have an image as input, the output of the pre-processing will also be an image, but with better properties for the classification task.

Then we want to do feature extraction. You remember the example with the apples and pears.

From these we then extract features, which places every sample in some high-dimensional vector space.

And in this vector space, we can then go ahead and do the classification.
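As a minimal sketch of this classical pipeline, the following code chains the three stages; the names preprocess, extract_features, and classify are hypothetical placeholders, not specific components from the lecture.

```python
def pattern_recognition_pipeline(measurement, preprocess, extract_features, classify):
    # Pre-processing stays in the input domain: an image goes in,
    # a cleaner image comes out.
    cleaned = preprocess(measurement)
    # Feature extraction maps the cleaned input into a high-dimensional
    # feature vector (think of the apples-and-pears example).
    features = extract_features(cleaned)
    # Classification takes place in that feature space.
    return classify(features)
```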

Now what we've seen in the perceptron is that we are only able to model linear decision boundaries.

And this immediately then led to the observation that perceptrons cannot solve the logical exclusive OR, the so-called XOR.

And you can see the visualization of the XOR problem here on the left hand side.

So imagine you have some kind of distribution of classes where the samples in the top left and the bottom right are blue, and the other class sits in the bottom left and the top right.

This constellation is inspired by the logical XOR function, and if you look at it, you will not be able to separate those two point clouds with a single linear decision boundary.

So you either need curves, or, what also helps in this kind of constellation, you use multiple lines.

But with a single perceptron, you will not be able to learn to solve this problem.
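To see why, here is a short argument that is not in the video but is standard; it assumes a single perceptron that classifies by the sign of $w_1 x_1 + w_2 x_2 + b$, with weights $w_1, w_2$ and bias $b$. The four XOR inputs would require

\begin{aligned}
(0,0) \mapsto 0:\;& b < 0, \\
(1,0) \mapsto 1:\;& w_1 + b > 0, \\
(0,1) \mapsto 1:\;& w_2 + b > 0, \\
(1,1) \mapsto 0:\;& w_1 + w_2 + b < 0 .
\end{aligned}

Adding the second and third inequality gives $w_1 + w_2 + 2b > 0$, hence $w_1 + w_2 + b > -b > 0$, which contradicts the fourth inequality. So no single linear decision boundary can solve XOR.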

This mattered because people had been arguing: look, we can model logical functions with perceptrons.

And if we then build perceptrons on top of perceptrons, we can essentially build all of logic.

Now, if you can't even build XOR, then you're probably not able to describe all of logic.

And therefore, so the argument went, we would never get there.

This was a period of time when funding for artificial intelligence research was cut down tremendously and people would not get any new grants.

They would not get money to support the research.

This period became known as the AI winter.

Things changed with the introduction of the multilayer perceptron.

This is an extension of the perceptron.

Instead of using just a single neuron, you use multiple of those neurons and arrange them in layers.

So here you can see a very simple sketch.

So this is very similar to the perceptron.

You have essentially some inputs, some weights, and now you can see that it's not just a single sum.

Instead, we have several of those sums, each going through a nonlinearity.

Their outputs are then weighted again, summed up again, and passed through another nonlinearity.

So this is very interesting because with multiple neurons, we can now also model nonlinear decision boundaries.
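As a small illustration, the following sketch builds a two-layer network with hand-picked weights that solves the XOR problem from above; the specific weights, the step activation, and the function names are illustrative choices, not taken from the lecture.

```python
import numpy as np

def step(z):
    # A hard threshold nonlinearity, as in the classical perceptron.
    return (z > 0).astype(float)

def xor_mlp(x):
    # Hidden layer: two perceptron-like units.
    # The first unit fires for "x1 OR x2", the second for "x1 AND x2".
    W1 = np.array([[1.0, 1.0],
                   [1.0, 1.0]])
    b1 = np.array([-0.5, -1.5])
    h = step(W1 @ x + b1)
    # Output unit: "OR and not AND", which is exactly XOR.
    w2 = np.array([1.0, -1.0])
    return step(w2 @ h - 0.5)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, xor_mlp(np.array(x, dtype=float)))
# prints 0.0, 1.0, 1.0, 0.0
```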

You can go on and then arrange this in layers.

So what you typically do is you have some input layer.

This is our vector X.

And then you have several perceptrons that you arrange in hidden layers.

They're called hidden because we cannot directly observe what happens inside them: they take their inputs, apply weights, and compute something.

Only at the very end, at the output, do you again have a layer where you can observe what is actually happening.

And all of the weights that lie in between, in those hidden layers, are not directly observable.
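A generic forward pass through such a layered arrangement could look like the following sketch; the list-of-matrices representation and the tanh activation are just one possible choice, not the specific architecture from the slides.

```python
import numpy as np

def forward(x, weights, biases, activation=np.tanh):
    # x is the input vector; weights and biases hold one matrix and one
    # bias vector per layer (hidden layers plus the output layer).
    h = x
    for W, b in zip(weights, biases):
        # Each layer applies its weights, sums up, and passes the
        # result through a nonlinearity.
        h = activation(W @ h + b)
    return h
```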

Part of a video series
Access: open access
Duration: 00:17:38 min
Recording date: 2020-05-28
Uploaded: 2020-05-28 10:46:35
Language: en-US
Deep Learning - Feedforward Networks Part 1: This video introduces the topic of feedforward networks, universal approximation, and how to map a decision tree onto a neural network. Further reading: A gentle Introduction to Deep Learning.

Tags: Perceptron, Introduction, artificial intelligence, deep learning, machine learning, pattern recognition